Results 1 - 20 of 69
1.
Proc Natl Acad Sci U S A ; 121(19): e2315780121, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38687793

ABSTRACT

Measuring inbreeding and its consequences on fitness is central for many areas in biology including human genetics and the conservation of endangered species. However, there is no consensus on the best method, neither for quantification of inbreeding itself nor for the model to estimate its effect on specific traits. We simulated traits based on simulated genomes from a large pedigree and empirical whole-genome sequences of human data from populations with various sizes and structures (from the 1000 Genomes Project). We compare the ability of various inbreeding coefficients ([Formula: see text]) to quantify the strength of inbreeding depression: allele-sharing, two versions of the correlation of uniting gametes which differ in the weight they attribute to each locus, and two identical-by-descent segments-based estimators. We also compare two models: the standard linear model and a linear mixed model (LMM) including a genetic relatedness matrix (GRM) as a random effect to account for the nonindependence of observations. We find LMMs give better results in scenarios with population or family structure. Within the LMM, we compare three different GRMs and show that in homogeneous populations, there is little difference among the different [Formula: see text] and GRM for inbreeding depression quantification. However, as soon as a strong population or family structure is present, the strength of inbreeding depression can be most efficiently estimated only if i) the phenotypes are regressed on [Formula: see text] based on a weighted version of the correlation of uniting gametes, giving more weight to common alleles, and ii) the GRM is obtained from an allele-sharing relatedness estimator.


Subject(s)
Inbreeding Depression , Models, Genetic , Humans , Pedigree , Genetics, Population/methods , Inbreeding , Alleles
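
Illustrative sketch (not taken from the article): the "standard linear model" mentioned in the abstract above can be pictured as an ordinary least-squares regression of phenotype on an individual inbreeding coefficient, with the slope read as the strength of inbreeding depression. The function name and the toy numbers are assumptions for illustration only, and the sketch deliberately omits the GRM random effect that distinguishes the LMM.

```python
def inbreeding_depression_slope(f_values, phenotypes):
    """Ordinary least-squares fit of phenotype = intercept + slope * F.

    f_values   : inbreeding coefficients, one per individual
    phenotypes : trait values, in the same order
    Returns (slope, intercept); a negative slope indicates inbreeding depression.
    """
    n = len(f_values)
    if n != len(phenotypes) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_f = sum(f_values) / n
    mean_y = sum(phenotypes) / n
    # Sum of squares of F and cross-products of F with the phenotype
    sxx = sum((f - mean_f) ** 2 for f in f_values)
    sxy = sum((f - mean_f) * (y - mean_y) for f, y in zip(f_values, phenotypes))
    if sxx == 0:
        raise ValueError("all inbreeding coefficients are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_f
    return slope, intercept


# Hypothetical toy data generated without noise from y = 10 - 3 * F
toy_f = [0.0, 0.0625, 0.125, 0.25]
toy_y = [10 - 3 * f for f in toy_f]
slope, intercept = inbreeding_depression_slope(toy_f, toy_y)
```

With structured samples the abstract reports that this plain regression is outperformed by the mixed model, so the sketch is a baseline rather than the recommended analysis.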
2.
PLoS Genet ; 20(2): e1011133, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38412146

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pgen.1010871.].

3.
Forensic Sci Int Genet ; 69: 103009, 2024 03.
Article in English | MEDLINE | ID: mdl-38237274

ABSTRACT

Population data have become available for sequence data to aid forensic investigations and prepare the forensic community in the move towards implementing NGS methods. This comes with a need for updated population genetic parameter estimates to allow DNA evidence evaluations using sequence data. Initial work has been done on a small sample and here we expand this work by providing estimates of population structure and relatedness for autosomal STR data generated by sequencing technologies. We also discuss the effect of inbreeding on forensic calculations and discuss why the use of genotypic-based estimates may be preferred over allelic-based estimates.


Subject(s)
Forensic Genetics , Inbreeding , Humans , Forensic Genetics/methods , Microsatellite Repeats , Genotype , DNA/genetics , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , DNA Fingerprinting/methods
4.
PLoS Genet ; 19(11): e1010871, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38011288

ABSTRACT

Being able to properly quantify genetic differentiation is key to understanding the evolutionary potential of a species. One central parameter in this context is FST, the mean coancestry within populations relative to the mean coancestry between populations. Researchers have been estimating FST globally or between pairs of populations for a long time. More recently, it has been proposed to estimate population-specific FST values, and population-pair mean relative coancestry. Here, we review the several definitions and estimation methods of FST, and stress that they provide values relative to a reference population. We show the good statistical properties of an allele-sharing, method of moments based estimator of FST (global, population-specific and population-pair) under a very general model of population structure. We point to the limitation of existing likelihood and Bayesian estimators when the populations are not independent. Last, we show that recent attempts to estimate absolute, rather than relative, mean coancestry fail to do so.


Subject(s)
Biological Evolution , Models, Genetic , Alleles , Bayes Theorem , Genetic Drift , Genetics, Population
5.
Forensic Sci Int Synerg ; 6: 100335, 2023.
Article in English | MEDLINE | ID: mdl-37325613

ABSTRACT

With the introduction of next generation sequencing (NGS) technology in the forensic field, it will be of interest to assess if forensic scientists feel equipped to interpret and present DNA evidence for sequence data. Here, we describe perceptions of sixteen U.S.-based forensic scientists on statistical models, sequence data, and ethical implications for DNA evidence evaluations. To get an in-depth understanding of the current situation, we used a qualitative research approach with a cross-sectional study design. Semi-structured interviews (N = 16) were conducted with U.S. forensic scientists working with DNA evidence. Open-ended interview questions were used to explore participants' views and needs surrounding the use of statistical models and sequence data for forensic purposes. We conducted a conventional content analysis using ATLAS.ti software and employed a second coder to ensure reliability of our results. Eleven themes emerged: 1) a statistical model that maximizes the value of the evidence is preferred; 2) a high-level understanding of the statistical model used is generally sufficient; 3) transparency is key in minimizing the risk of creating black boxes; 4) training and education should be an ongoing effort; 5) the effectiveness of presenting results in court can be improved; 6) NGS has the potential to become revolutionary; 7) some hesitations surrounding the use of sequence data remain; 8) there is a need for a concrete plan to alleviate barriers to the implementation of sequencing techniques; 9) ethics plays a major part in the role of a forensic scientist; 10) ethical barriers for sequence data depend on the application; 11) DNA evidence has its limitations. The results of this study give insight into the perceptions of forensic scientists regarding the use of statistical models and sequence data, providing valuable information in the move towards implementing sequencing methods for DNA evidence evaluations.

6.
Nat Genet ; 54(7): 934-939, 2022 07.
Article in English | MEDLINE | ID: mdl-35817969

ABSTRACT

The quantitative geneticist W. G. ('Bill') Hill, awardee of the 2018 Darwin Medal of the Royal Society and the 2019 Mendel Medal of the Genetics Society (United Kingdom), died on 17 December 2021 at the age of 81 years. Here, we pay tribute to his multiple key scientific contributions, which span population and evolutionary genetics, animal and plant breeding and human genetics. We discuss his theoretical research on the role of linkage disequilibrium (LD) and mutational variance in the response to selection, the origin of the widely used LD metric r² in genomic association studies, the genetic architecture of complex traits, the quantification of the variation in realized relationships given a pedigree relationship and much more. We demonstrate that basic theoretical research in quantitative and statistical genetics has led to profound insights into the genetics and evolution of complex traits and made predictions that were subsequently empirically validated, often decades later.


Subject(s)
Genome , Plant Breeding , Animals , Genome-Wide Association Study , Genomics , Humans , Linkage Disequilibrium
8.
Philos Trans R Soc Lond B Biol Sci ; 377(1852): 20200420, 2022 06 06.
Article in English | MEDLINE | ID: mdl-35430892

ABSTRACT

In his 1972 paper 'The apportionment of human diversity', Lewontin showed that, when averaged over loci, genetic diversity is predominantly attributable to differences among individuals within populations. However, selection can alter the apportionment of diversity of specific genes or genomic regions. We examine genetic diversity at the human leucocyte antigen (HLA) loci, located within the major histocompatibility complex (MHC) region. HLA genes code for proteins that are critical to adaptive immunity and are well-documented targets of balancing selection. The single-nucleotide polymorphisms (SNPs) within HLA genes show strong signatures of balancing selection on large timescales and are broadly shared among populations, displaying low FST values. However, when we analyse haplotypes defined by these SNPs (which define 'HLA alleles'), we find marked differences in frequencies between geographic regions. These differences are not reflected in the FST values because of the extreme polymorphism at HLA loci, illustrating challenges in interpreting FST. Differences in the frequency of HLA alleles among geographic regions are relevant to bone-marrow transplantation, which requires genetic identity at HLA loci between patient and donor. We discuss the case of Brazil's bone marrow registry, where a deficit of enrolled volunteers with African ancestry reduces the chance of finding donors for individuals with an MHC region of African ancestry. This article is part of the theme issue 'Celebrating 50 years since Lewontin's apportionment of human diversity'.


Subject(s)
Major Histocompatibility Complex , Polymorphism, Single Nucleotide , Alleles , Gene Frequency , Haplotypes , Humans , Major Histocompatibility Complex/genetics
9.
Forensic Sci Int Genet ; 58: 102680, 2022 05.
Article in English | MEDLINE | ID: mdl-35313226

ABSTRACT

The Hardy-Weinberg law is shown to be transitive in the sense that a multi-allelic polymorphism that is in equilibrium will retain its equilibrium status if any allele together with its corresponding genotypes is deleted from the population. Similarly, the transitivity principle also applies if alleles are joined, which leads to the summation of allele frequencies and their corresponding genotype frequencies. These basic polymorphism properties are intuitive, but they have apparently not been formalized or investigated. This article provides a straightforward proof of the transitivity principle, and its usefulness in genetic data analysis is explored, using high-quality autosomal microsatellite databases from the US National Institute of Standards and Technology. We address the reduction of multi-allelic polymorphisms to variants with fewer alleles, two in the limit. Equilibrium test results obtained with the original and reduced polymorphisms are generally observed to be coherent, in particular when results obtained with length-based and sequence-based microsatellites are compared. We exploit the transitivity principle in order to identify disequilibrium-related alleles, and show its usefulness for detecting population substructure and genotyping problems that relate to null alleles and allele imbalance.


Subject(s)
Polymorphism, Genetic , Alleles , Gene Frequency , Genotype , Humans
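
Illustrative sketch (not the article's proof): the transitivity property described in the abstract above can be checked numerically. Starting from Hardy-Weinberg genotype frequencies, deleting one allele (with every genotype that carries it) or joining two alleles into one should again give Hardy-Weinberg proportions. The helper names and the four-allele toy frequencies are assumptions; exact fractions are used so that equality can be tested without rounding error.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def hwe_genotypes(p):
    """Hardy-Weinberg genotype frequencies for allele frequencies p (allele -> freq)."""
    g = {}
    for a, b in combinations_with_replacement(sorted(p), 2):
        g[(a, b)] = p[a] * p[b] if a == b else 2 * p[a] * p[b]
    return g


def allele_freqs(g):
    """Allele frequencies implied by genotype frequencies g ((a, b) -> freq)."""
    p = {}
    for (a, b), f in g.items():
        p[a] = p.get(a, 0) + f / 2
        p[b] = p.get(b, 0) + f / 2
    return p


def delete_allele(g, k):
    """Drop every genotype carrying allele k and renormalise the rest."""
    kept = {gt: f for gt, f in g.items() if k not in gt}
    total = sum(kept.values())
    return {gt: f / total for gt, f in kept.items()}


def join_alleles(g, a1, a2, new):
    """Relabel alleles a1 and a2 as one allele `new`, summing genotype frequencies."""
    out = {}
    for (a, b), f in g.items():
        a = new if a in (a1, a2) else a
        b = new if b in (a1, a2) else b
        key = tuple(sorted((a, b)))
        out[key] = out.get(key, 0) + f
    return out


def is_hwe(g):
    """True if g equals the Hardy-Weinberg expectation from its own allele frequencies."""
    return g == hwe_genotypes(allele_freqs(g))


# Hypothetical four-allele locus in exact Hardy-Weinberg proportions
p = {"A": Fraction(2, 5), "B": Fraction(3, 10), "C": Fraction(1, 5), "D": Fraction(1, 10)}
g = hwe_genotypes(p)
print(is_hwe(delete_allele(g, "D")))          # True
print(is_hwe(join_alleles(g, "A", "B", "AB")))  # True
```

Real genotype counts are subject to sampling noise, so in practice an equilibrium test statistic would replace the exact equality used here.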
10.
Nat Genet ; 54(3): 263-273, 2022 03.
Article in English | MEDLINE | ID: mdl-35256806

ABSTRACT

Analyses of data from genome-wide association studies on unrelated individuals have shown that, for human traits and diseases, approximately one-third to two-thirds of heritability is captured by common SNPs. However, it is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular whether the causal variants are rare, or whether it is overestimated due to bias in inference from pedigree data. Here we estimated heritability for height and body mass index (BMI) from whole-genome sequence data on 25,465 unrelated individuals of European ancestry. The estimated heritability was 0.68 (standard error 0.10) for height and 0.30 (standard error 0.10) for body mass index. Low minor allele frequency variants in low linkage disequilibrium (LD) with neighboring variants were enriched for heritability, to a greater extent for protein-altering variants, consistent with negative selection. Our results imply that rare variants, in particular those in regions of low linkage disequilibrium, are a major source of the still missing heritability of complex traits and disease.


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Alleles , Genome-Wide Association Study/methods , Humans , Linkage Disequilibrium , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics
11.
Nat Hum Behav ; 6(3): 371-382, 2022 03.
Article in English | MEDLINE | ID: mdl-35165434

ABSTRACT

Transnational ivory traffickers continue to smuggle large shipments of elephant ivory out of Africa, yet prosecutions and convictions remain few. We identify trafficking networks on the basis of genetic matching of tusks from the same individual or close relatives in separate shipments. Analyses are drawn from 4,320 savannah (Loxodonta africana) and forest (L. cyclotis) elephant tusks, sampled from 49 large ivory seizures totalling 111 t, shipped out of Africa between 2002 and 2019. Network analyses reveal a repeating pattern wherein tusks from the same individual or close relatives are found in separate seizures that were containerized in, and transited through, common African ports. Results suggest that individual traffickers are exporting dozens of shipments, with considerable connectivity between traffickers operating in different ports. These tools provide a framework to combine evidence from multiple investigations, strengthen prosecutions and support indictment and prosecution of transnational ivory traffickers for the totality of their crimes.


Subject(s)
Elephants , Africa , Animals , Conservation of Natural Resources , Crime , Elephants/genetics , Genotype , Humans
12.
Article in English | MEDLINE | ID: mdl-38077656

ABSTRACT

A new calculation module within the PopStats module of the CODIS software package, based on the underlying mathematics presented in the MixKin software package, has been developed for assigning the Likelihood Ratio (LR) of DNA mixture profiles. This module uses a semi-continuous model that allows for population structure and allelic drop-out and drop-in but does not require allelic peak heights or other laboratory-specific parameters. This new implementation (named SC Mixture), like MixKin, does not specify or estimate a probability of drop-out. Instead, each contributor to a mixture has an independent drop-out rate, and the probability of the mixture profile for a specified proposition concerning the contributors is integrated over the range of possible drop-out rates. The allelic drop-in rate and the population structure parameter, theta, used by the software are specified by the user. The user can examine up to five contributors to a mixture; however, conditioning on assumed contributors and limiting the number of unknowns in both numerator and denominator hypotheses greatly improves performance. We report results from an extensive validation study performed for ten mixtures with each of one (single source), two, three, four, or five contributors, with four combinations of drop-in rate and a population structure parameter. Each mixture was run as a complete profile or with the random removal of alleles to simulate drop-out. All 1620 combinations were evaluated with PopStats, MixKin, and LRmix, and considerable consistency was found among the results with all three packages.

14.
Heredity (Edinb) ; 128(1): 1-10, 2022 01.
Article in English | MEDLINE | ID: mdl-34824382

ABSTRACT

The two alleles an individual carries at a locus are identical by descent (ibd) if they have descended from a single ancestral allele in a reference population, and the probability of such identity is the inbreeding coefficient of the individual. Inbreeding coefficients can be predicted from pedigrees with founders constituting the reference population, but estimation from genetic data is not possible without data from the reference population. Most inbreeding estimators that make explicit use of sample allele frequencies as estimates of allele probabilities in the reference population are confounded by average kinships with other individuals. This means that the ranking of those estimates depends on the scope of the study sample and we show the variation in rankings for common estimators applied to different subdivisions of 1000 Genomes data. Allele-sharing estimators of within-population inbreeding relative to average kinship in a study sample, however, do have invariant rankings across all studies including those individuals. They are unbiased with a large number of SNPs. We discuss how allele sharing estimates are the relevant quantities for a range of empirical applications.


Subject(s)
Inbreeding , Polymorphism, Single Nucleotide , Alleles , Gene Frequency , Humans , Models, Genetic , Pedigree
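
Illustrative sketch (an assumed form, since the abstract above does not give the formula): one common way to write an allele-sharing inbreeding estimate for SNP data is F_j = (A_j - A_S) / (1 - A_S), where A_j is the proportion of loci at which individual j is homozygous and A_S is the average allele sharing between pairs of distinct individuals in the study sample. The function name, the 0/1/2 dosage coding, and the toy genotypes are all assumptions for illustration.

```python
from itertools import combinations


def allele_sharing_inbreeding(dosages):
    """Inbreeding estimates relative to the average kinship in the sample.

    dosages : list of per-individual lists of SNP dosages (0, 1 or 2 copies
              of a chosen allele), all of the same length and without
              missing values.
    Returns one estimate per individual, in input order.
    """
    n = len(dosages)
    n_loci = len(dosages[0])
    # Within-individual sharing: 1 at homozygous loci, 0 at heterozygous loci
    self_share = [sum((x - 1) ** 2 for x in ind) / n_loci for ind in dosages]
    # Between-individual sharing at a locus: (1 + (x_j - 1)(x_k - 1)) / 2
    pair_total = 0.0
    for j, k in combinations(range(n), 2):
        pair_total += sum(
            (1 + (xj - 1) * (xk - 1)) / 2 for xj, xk in zip(dosages[j], dosages[k])
        ) / n_loci
    a_s = pair_total / (n * (n - 1) / 2)
    return [(a_j - a_s) / (1 - a_s) for a_j in self_share]


# Hypothetical sample: three individuals typed at four SNPs
toy = [[0, 2, 0, 2], [1, 1, 1, 1], [2, 0, 1, 1]]
estimates = allele_sharing_inbreeding(toy)
```

Because the reference is the average kinship of the sample itself, estimates can be negative, and a fully homozygous individual always receives a value of 1.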
15.
Front Immunol ; 11: 584950, 2020.
Article in English | MEDLINE | ID: mdl-33240273

ABSTRACT

A match of HLA loci between patients and donors is critical for successful hematopoietic stem cell transplantation. However, the extreme polymorphism of HLA loci - an outcome of millions of years of natural selection - reduces the chances that two individuals will carry identical combinations of multilocus HLA genotypes. Further, HLA variability is not homogeneously distributed throughout the world: African populations on average have greater variability than non-Africans, reducing the chances that two unrelated African individuals are HLA identical. Here, we explore how self-identification (often equated with "ethnicity" or "race") and genetic ancestry are related to the chances of finding HLA compatible donors in a large sample from Brazil, a highly admixed country. We query REDOME, Brazil's Bone Marrow Registry, and investigate how different criteria for identifying ancestry influence the chances of finding a match. We find that individuals who self-identify as "Black" and "Mixed" on average have lower chances of finding matches than those who self-identify as "White" (up to 57% reduction). We next show that an individual's African genetic ancestry, estimated using molecular markers and quantified as the proportion of an individual's genome that traces its ancestry to Africa, is strongly associated with reduced chances of finding a match (up to 60% reduction). Finally, we document that the strongest reduction in chances of finding a match is associated with having an MHC region of exclusively African ancestry (up to 75% reduction). We apply our findings to a specific condition, for which there is a clinical indication for transplantation: sickle-cell disease. We show that the increased African ancestry in patients with this disease leads to reduced chances of finding a match, when compared to the remainder of the sample, without the condition. 
Our results underscore the influence of ancestry on the chances of finding compatible HLA matches, and indicate that efforts aimed at increasing the African component of registries are necessary.


Subject(s)
Anemia, Sickle Cell/genetics , Black People/genetics , Bone Marrow/surgery , Bone Marrow Transplantation/methods , Brazil , Ethnicity/genetics , Gene Frequency/genetics , Genotype , HLA Antigens/genetics , Hematopoietic Stem Cell Transplantation/methods , Histocompatibility Testing/methods , Humans , Polymorphism, Genetic/genetics , Registries , Unrelated Donors , White People/genetics
16.
Forensic Sci Int Genet ; 49: 102364, 2020 11.
Article in English | MEDLINE | ID: mdl-32805606

ABSTRACT

Match probabilities calculated during the evaluation of DNA evidence profiles rely on appropriate values of the population structure quantity θ. NGS-based methods will enhance forensic identification and with the transformation to such methods comes the need to facilitate NGS-based population genetics analysis. If NGS data are to be used for match probabilities there needs to be a way to accommodate population structure, which requires values for θ for those data. Such estimates have not been available. This study assesses population structure for sequence-based data using a relatively new approach applied to STR data over 27 loci in five different geographic groups. Matching proportions between individuals or groups are used to obtain locus-specific θ estimates as well as estimates per geographic group and a global measure. The results demonstrate similar effects of sequencing data on θ estimates compared to what has been seen for CE-based results.


Subject(s)
Genetic Markers , Genetics, Population , High-Throughput Nucleotide Sequencing , Microsatellite Repeats , Sequence Analysis, DNA , Alleles , DNA Fingerprinting , Genotype , Humans , Racial Groups/genetics
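
Illustrative sketch (an assumed form, not quoted from the article): the abstract above says that matching proportions are used to obtain θ estimates. One way to write such an estimator at a single locus is θ_i = (M_W,i - M_B) / (1 - M_B), where M_W,i is the proportion of matching allele pairs within group i and M_B is the average matching proportion between pairs of groups. The function names and the toy allele labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def within_matching(alleles):
    """Proportion of pairs of distinct sampled alleles within one group that match."""
    n = len(alleles)
    counts = Counter(alleles)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def between_matching(alleles_1, alleles_2):
    """Proportion of allele pairs, one allele from each group, that match."""
    c1, c2 = Counter(alleles_1), Counter(alleles_2)
    return sum(c1[u] * c2[u] for u in c1) / (len(alleles_1) * len(alleles_2))


def theta_estimates(populations):
    """Group-specific and global theta for one locus.

    populations : dict mapping group name -> list of sampled alleles
                  (two or more groups, each with two or more alleles).
    """
    names = sorted(populations)
    m_w = {k: within_matching(populations[k]) for k in names}
    pairs = list(combinations(names, 2))
    m_b = sum(
        between_matching(populations[a], populations[b]) for a, b in pairs
    ) / len(pairs)
    per_group = {k: (m_w[k] - m_b) / (1 - m_b) for k in names}
    global_theta = sum(per_group.values()) / len(per_group)
    return per_group, global_theta


# Hypothetical STR alleles labelled by repeat number
toy = {"group1": ["12", "12", "12", "12"], "group2": ["12", "12", "13", "13"]}
per_group, global_theta = theta_estimates(toy)
```

A multi-locus version would typically sum numerators and denominators over loci before dividing; that step is left out to keep the sketch short.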
17.
Genetics ; 212(4): 955-957, 2019 08.
Article in English | MEDLINE | ID: mdl-31405996

ABSTRACT

The Elizabeth W. Jones Award for Excellence in Education recognizes an individual or group that has had significant, sustained impact on genetics education at any level, from K-12 through graduate school and beyond. Bruce Weir (University of Washington) is the 2019 recipient in recognition of his work training thousands of researchers in the rigorous use of statistical analysis methods for genetic and genomic data. His contributions fall into three categories: the acclaimed Summer Institute in Statistical Genetics, which has been held continuously for 23 years and has trained > 10,000 researchers worldwide; the popular graduate-level textbook Genetic Data Analysis; and the training of a growing number of forensic geneticists during the rise of DNA evidence in courts around the world.


Subject(s)
Awards and Prizes , Genetics/history , Statistics as Topic/history , History, 21st Century , United States
18.
Theor Popul Biol ; 128: 19-26, 2019 08.
Article in English | MEDLINE | ID: mdl-31145877

ABSTRACT

The linkage disequilibrium coefficient r² is a measure of statistical dependence of the alleles possessed by an individual at different genetic loci. It is widely used in association studies to search for the locations of disease-causing genes on chromosomes. Most studies to date treat r² as a fixed property of two loci in a finite population, and investigate the sampling distribution of estimators due to the statistical sampling of individuals from the population. Here, we instead consider the distribution of r² itself under a process of genetic sampling through the generations. Using a classical two-locus model for genetic drift, mutation, and recombination, we investigate the probability density function of r² at stationarity. This density function provides a tool for inference on evolutionary parameters such as mutation and recombination rates. We reconstruct the approximate stationary density of r² by calculating a finite sequence of the distribution's moments and applying the maximum entropy principle. Our approach is based on the diffusion approximation, under which we demonstrate that for certain models in population genetics, moments of the stationary distribution can be obtained without knowing the probability distribution itself. To illustrate our approach, we show how the stationary probability density of r² can be used in a maximum likelihood framework to estimate mutation and recombination rates from sample data of r².


Subject(s)
Linkage Disequilibrium , Models, Statistical , Algorithms , Alleles , Genetic Loci , Genetics, Population
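
Illustrative sketch: for two biallelic loci, the coefficient discussed in the abstract above is usually computed as r² = D² / (p_A (1 - p_A) p_B (1 - p_B)), with D = p_AB - p_A p_B. The code below evaluates this from a list of phased haplotypes; the function name and the 0/1 allele coding are assumptions, and the sketch covers only the sample statistic, not the stationary density derived in the article.

```python
def ld_r2(haplotypes):
    """Squared correlation r^2 between alleles at two biallelic loci.

    haplotypes : list of (a, b) tuples, each entry 0 or 1, giving the allele
                 carried at locus A and at locus B on one phased haplotype.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Alleles always travel together: complete dependence
print(ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)]))  # 1.0
# All four haplotypes equally frequent: no dependence
print(ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)]))  # 0.0
```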
19.
Sci Adv ; 4(9): eaat0625, 2018 09.
Article in English | MEDLINE | ID: mdl-30255141

ABSTRACT

Rapid growth in world trade has enabled transnational criminal networks to conceal their contraband among the 1 billion containers shipped worldwide annually. Forensic methods are needed to identify the major cartels moving the contraband into transit. We combine DNA-based sample matching and geographic assignment of tusks to show that the two tusks from the same elephant are often shipped by the same trafficker in separate large consignments of ivory. The paired shipments occur close in time from the same initial place of export and have high overlap in the geographic origins of their tusks. Collectively, these paired shipments form a linked chain that reflects the sizes, interconnectedness, and places of operation of Africa's largest ivory smuggling cartels.

20.
Mol Ecol ; 27(20): 4121-4135, 2018 10.
Article in English | MEDLINE | ID: mdl-30107060

ABSTRACT

The concept of kinship permeates many domains of fundamental and applied biology ranging from social evolution to conservation science to quantitative and human genetics. Until recently, pedigrees were the gold standard to infer kinship, but the advent of next-generation sequencing and the availability of dense genetic markers in many species make it a good time to (re)evaluate the usefulness of genetic markers in this context. Using three published data sets where both pedigrees and markers are available, we evaluate two common and a new genetic estimator of kinship. We show discrepancies between pedigree values and marker estimates of kinship and explore via simulations the possible reasons for these. We find these discrepancies are attributable to two main sources: pedigree errors and heterogeneity in the origin of founders. We also show that our new marker-based kinship estimator has very good statistical properties and behaviour and is particularly well suited for situations where the source population is of small size, as will often be the case in conservation biology, and where high levels of kinship are expected, as is typical in social evolution studies.


Subject(s)
Genetics, Population/methods , Pedigree , Genetic Markers , Humans , Models, Genetic